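Skeleton-based recognition systems are gaining popularity, and machine learning models that focus on points or joints in a skeleton have proven to be computationally efficient and applicable in many fields, such as robotics. Points are easy to track, which preserves both spatial and temporal information; this plays an important role in abstracting the required information and makes classification an easier task. In this paper, we study these points using a cloud mechanism, where we define a cloud as a collection of points. When temporal information is added, however, it may not be possible to retrieve the coordinates of a point in every frame. Instead of focusing on a single point, we can use k-neighbors to retrieve the state of the point in question. Our focus is on gathering such information using weight sharing, while ensuring that we do not carry noise along when retrieving information from the neighbors. LSTMs have long-term modeling capability and can carry both temporal and spatial information. In this paper, we attempt to summarize graph-based gesture recognition methods.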
Translated by Google Translate
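The k-neighbor idea in the abstract above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the state of a point that cannot be tracked in a given frame is estimated as the mean of its k nearest points in that frame's cloud; the abstract does not specify the aggregation, and the function names `k_nearest` and `estimate_state` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def k_nearest(query: Point, cloud: Sequence[Point], k: int) -> List[Point]:
    """Return the k points of `cloud` closest to `query` (Euclidean distance)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sorted(cloud, key=lambda p: math.dist(p, query))[:k]


def estimate_state(query: Point, cloud: Sequence[Point], k: int = 3) -> Point:
    """Estimate the state at `query` as the mean of its k nearest neighbors.

    Assumption: plain averaging stands in for whatever aggregation the
    surveyed methods actually use (for example, learned shared weights).
    """
    neighbors = k_nearest(query, cloud, k)
    if not neighbors:
        raise ValueError("cloud is empty")
    n = len(neighbors)
    return tuple(sum(p[i] for p in neighbors) / n for i in range(3))
```

A call such as `estimate_state(last_known_position, frame_cloud, k=3)` would then stand in for the missing joint, where `last_known_position` and `frame_cloud` are placeholder names for the last tracked coordinate and the current frame's points.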
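We present a new technique for accelerating the training of physics-informed neural networks (PINNs): discretely-trained PINNs (DT-PINNs). The repeated computation of derivatives in the PINN loss function via automatic differentiation during training is known to be computationally expensive, especially for higher-order derivatives. DT-PINNs are trained by replacing these exact spatial derivatives with high-order accurate numerical discretizations computed using meshless radial basis function-finite differences (RBF-FD) and applied via sparse matrix-vector multiplication. The use of RBF-FD allows DT-PINNs to be trained even on point cloud samples placed on irregular domain geometries. Additionally, although traditional PINNs (vanilla-PINNs) are typically stored and trained in 32-bit floating point (FP32) on the GPU, we show that for DT-PINNs, using FP64 on the GPU leads to significantly faster training times than FP32 vanilla-PINNs with comparable accuracy. We demonstrate the efficiency and accuracy of DT-PINNs via a series of experiments. First, we explore the effect of network depth on both numerical and automatic differentiation of a neural network with random weights, and show that RBF-FD approximations of third-order accuracy and above are more efficient while remaining very accurate. We then compare DT-PINNs to vanilla-PINNs on both linear and nonlinear Poisson equations, and show that DT-PINNs achieve similar losses with faster training times on a consumer GPU. Finally, we also demonstrate that a PINN solution to a space-time problem can be obtained by discretizing the spatial derivatives using RBF-FD and using automatic differentiation for the temporal derivative. Our results show that FP64 DT-PINNs offer a superior cost-accuracy profile to FP32 vanilla-PINNs.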
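The idea of applying a precomputed derivative operator through a sparse matrix-vector product can be sketched as follows. This is not the authors' implementation: a standard second-order central-difference stencil on a uniform 1D grid is swapped in for RBF-FD weights on scattered points, the sparse matrix is stored as plain dictionaries rather than a GPU sparse format, and all function names are hypothetical.

```python
from typing import Dict, List, Sequence

SparseRow = Dict[int, float]  # column index -> stencil weight


def second_derivative_matrix(n: int, h: float) -> List[SparseRow]:
    """Sparse rows of a central-difference d^2/dx^2 operator on n uniform points.

    Stand-in for RBF-FD weights. Boundary rows are left empty because
    boundary conditions would be enforced by a separate loss term.
    """
    if n < 3:
        raise ValueError("need at least one interior point")
    rows: List[SparseRow] = [{} for _ in range(n)]
    for i in range(1, n - 1):
        rows[i] = {i - 1: 1.0 / h**2, i: -2.0 / h**2, i + 1: 1.0 / h**2}
    return rows


def spmv(rows: Sequence[SparseRow], u: Sequence[float]) -> List[float]:
    """Sparse matrix-vector product: only stored weights are touched."""
    return [sum(w * u[j] for j, w in row.items()) for row in rows]


def poisson_residual_loss(
    rows: Sequence[SparseRow], u: Sequence[float], f: Sequence[float]
) -> float:
    """Mean squared residual of u'' = f over the interior points."""
    lap = spmv(rows, u)
    interior = [i for i, row in enumerate(rows) if row]
    return sum((lap[i] - f[i]) ** 2 for i in interior) / len(interior)
```

In a training loop, `u` would hold the network outputs at the collocation points, so the derivative term of the loss costs one sparse product per step instead of repeated automatic differentiation through the network.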
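We perform a rigorous evaluation of recent self-supervised and semi-supervised ML techniques that leverage unlabeled data to improve downstream task performance, on three remote sensing tasks: riverbed segmentation, land cover mapping, and flood mapping. These methods are especially valuable for remote sensing tasks because unlabeled imagery is easy to access while ground truth labels are often expensive to obtain. We quantify the performance improvements one can expect on these remote sensing segmentation tasks when unlabeled imagery (outside of the labeled dataset) is made available for training. We also design experiments to test the effectiveness of these techniques when the test set has a domain shift relative to the training and validation sets.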
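This paper presents AlphaGarden: an autonomous polyculture garden that prunes and irrigates living plants in a 1.5 m × 3.0 m physical testbed. AlphaGarden uses an overhead camera and sensors to track plant distribution and soil moisture. We model individual plant growth and inter-plant dynamics to train a policy that chooses actions to maximize leaf coverage and diversity. For autonomous pruning, AlphaGarden uses two custom-designed pruning tools and a trained neural network to detect prune points. We present results for four 60-day garden cycles. The results suggest that AlphaGarden can autonomously achieve a normalized diversity of 0.96 while maintaining an average canopy coverage of 0.86 during the peak of the cycle. Code, datasets, and supplemental material can be found at https://github.com/berkeleyautomation/alpharden.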
Embedding words in vector space is a fundamental first step in state-of-the-art natural language processing (NLP). Typical NLP solutions employ pre-defined vector representations to improve generalization by co-locating similar words in vector space. For instance, Word2Vec is a self-supervised predictive model that captures the context of words using a neural network. Similarly, GloVe is a popular unsupervised model incorporating corpus-wide word co-occurrence statistics. Such word embeddings have significantly boosted important NLP tasks, including sentiment analysis, document classification, and machine translation. However, the embeddings are dense floating-point vectors, making them expensive to compute and difficult to interpret. In this paper, we instead propose to represent the semantics of words with a few defining words that are related using propositional logic. To produce such logical embeddings, we introduce a Tsetlin Machine-based autoencoder that learns logical clauses in a self-supervised manner. The clauses consist of contextual words like "black," "cup," and "hot" to define other words like "coffee," thus being human-understandable. We evaluate our embedding approach on several intrinsic and extrinsic benchmarks, outperforming GloVe on six classification tasks. Furthermore, we investigate the interpretability of our embedding using the logical representations acquired during training. We also visualize word clusters in vector space, demonstrating how our logical embeddings co-locate similar words.
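To make the notion of a logical clause concrete, the sketch below evaluates a conjunctive clause over a bag of context words, using the "black," "cup," and "hot" example from the abstract. It covers only clause evaluation, not the Tsetlin Machine learning procedure; the `Clause` class and the negated literal "tea" are hypothetical illustrations, not part of the described method.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class Clause:
    """A conjunction of word literals: all `required` words must be present
    in the context and no `forbidden` (negated) word may be present."""

    required: FrozenSet[str]
    forbidden: FrozenSet[str] = frozenset()

    def matches(self, context: Set[str]) -> bool:
        # Conjunction: every positive literal holds and every negated one fails.
        return self.required <= context and not (self.forbidden & context)


# Hypothetical clause defining "coffee": black AND cup AND hot AND NOT tea.
coffee_clause = Clause(
    required=frozenset({"black", "cup", "hot"}),
    forbidden=frozenset({"tea"}),
)
```

Because the clause is just a set of named words, it can be read directly as "black AND cup AND hot AND NOT tea", which is the kind of human-readable definition the abstract refers to.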
Large training data and expensive model tweaking are standard features of deep learning for images. As a result, data owners often utilize cloud resources to develop large-scale complex models, which raises privacy concerns. Existing solutions are either too expensive to be practical or do not sufficiently protect the confidentiality of data and models. In this paper, we study and compare novel image disguising mechanisms, DisguisedNets and InstaHide, aiming to achieve a better trade-off among the level of protection for outsourced DNN model training, the expenses, and the utility of data. DisguisedNets are novel combinations of image blocktization, block-level random permutation, and two block-level secure transformations: random multidimensional projection (RMT) and AES pixel-level encryption (AES). InstaHide is an image mixup and random pixel flipping technique \cite{huang20}. We have analyzed and evaluated them under a multi-level threat model. RMT provides a better security guarantee than InstaHide, under the Level-1 adversarial knowledge with well-preserved model quality. In contrast, AES provides a security guarantee under the Level-2 adversarial knowledge, but it may affect model quality more. The unique features of image disguising also help us to protect models from model-targeted attacks. We have done an extensive experimental evaluation to understand how these methods work in different settings for different datasets.
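The blocktization and block-level random permutation steps can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows and a seeded pseudo-random shuffle as the key; it omits the RMT and AES transformations entirely, all function names are hypothetical, and `random.Random` is not cryptographically secure, so this shows the data flow only, not the security properties.

```python
import random
from typing import List, Tuple

Image = List[List[int]]


def to_blocks(img: Image, b: int) -> List[Image]:
    """Split an image into b x b blocks in row-major block order."""
    h, w = len(img), len(img[0])
    if h % b or w % b:
        raise ValueError("image size must be a multiple of the block size")
    return [[row[c:c + b] for row in img[r:r + b]]
            for r in range(0, h, b) for c in range(0, w, b)]


def from_blocks(blocks: List[Image], h: int, w: int, b: int) -> Image:
    """Reassemble row-major b x b blocks into an h x w image."""
    img = [[0] * w for _ in range(h)]
    per_row = w // b
    for idx, block in enumerate(blocks):
        r0, c0 = (idx // per_row) * b, (idx % per_row) * b
        for dr in range(b):
            img[r0 + dr][c0:c0 + b] = block[dr]
    return img


def disguise(img: Image, b: int, key: int) -> Tuple[Image, List[int]]:
    """Permute blocks with a key-seeded shuffle; return image and block order."""
    blocks = to_blocks(img, b)
    order = list(range(len(blocks)))
    random.Random(key).shuffle(order)  # illustrative only, not secure
    shuffled = [blocks[i] for i in order]
    return from_blocks(shuffled, len(img), len(img[0]), b), order


def restore(disguised: Image, b: int, order: List[int]) -> Image:
    """Invert `disguise` given the block order used."""
    blocks = to_blocks(disguised, b)
    original: List[Image] = [[] for _ in blocks]
    for pos, i in enumerate(order):
        original[i] = blocks[pos]
    return from_blocks(original, len(disguised), len(disguised[0]), b)
```

The permutation preserves every pixel value and only changes block positions, which is why the abstract pairs it with further block-level transformations rather than relying on it alone.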
Recent advances in deep learning have enabled us to address the curse of dimensionality (COD) by solving problems in higher dimensions. A subset of these approaches has led to solving high-dimensional PDEs, opening doors to a variety of real-world problems ranging from mathematical finance to stochastic control for industrial applications. Although feasible, these deep learning methods are still constrained by training time and memory. To tackle these shortcomings, we demonstrate that Tensor Neural Networks (TNN) can provide significant parameter savings while attaining the same accuracy as the classical Dense Neural Network (DNN). In addition, we show how TNN can be trained faster than DNN for the same accuracy. Besides TNN, we also introduce Tensor Network Initializer (TNN Init), a weight initialization scheme that leads to faster convergence with smaller variance for an equivalent parameter count as compared to a DNN. We benchmark TNN and TNN Init by applying them to solve the parabolic PDE associated with the Heston model, which is widely used in financial pricing theory.
When testing conditions differ from those represented in training data, so-called out-of-distribution (OOD) inputs can mar the reliability of black-box learned components in the modern robot autonomy stack. Therefore, coping with OOD data is an important challenge on the path towards trustworthy learning-enabled open-world autonomy. In this paper, we aim to demystify the topic of OOD data and its associated challenges in the context of data-driven robotic systems, drawing connections to emerging paradigms in the ML community that study the effect of OOD data on learned models in isolation. We argue that as roboticists, we should reason about the overall system-level competence of a robot as it performs tasks in OOD conditions. We highlight key research questions around this system-level view of OOD problems to guide future research toward safe and reliable learning-enabled autonomy.
The Tsetlin Machine (TM) has been gaining popularity as an inherently interpretable machine learning method that is able to achieve promising performance with low computational complexity on a variety of applications. The interpretability and the low computational complexity of the TM are inherited from the Boolean expressions used for representing various sub-patterns. Although possessing favorable properties, the TM has not been the go-to method for AI applications, mainly due to its conceptual and theoretical differences compared with perceptrons and neural networks, which are more widely known and well understood. In this paper, we provide detailed insights into the operational concept of the TM, and try to bridge the gap in the theoretical understanding between the perceptron and the TM. More specifically, we study the operational concept of the TM following the analytical structure of perceptrons, showing the resemblance between perceptrons and the TM. Through the analysis, we show that the TM's weight update can be considered a special case of the gradient weight update. We also perform an empirical analysis of the TM by showing the flexibility in determining the clause length, visualizing decision boundaries, and obtaining interpretable Boolean expressions from the TM. In addition, we discuss the advantages of the TM in terms of its structure and its ability to solve more complex problems.
Automatically estimating 3D skeleton, shape, camera viewpoints, and part articulation from sparse in-the-wild image ensembles is a severely under-constrained and challenging problem. Most prior methods rely on large-scale image datasets, dense temporal correspondence, or human annotations like camera pose, 2D keypoints, and shape templates. We propose Hi-LASSIE, which performs 3D articulated reconstruction from only 20-30 online images in the wild without any user-defined shape or skeleton templates. We follow the recent work of LASSIE that tackles a similar problem setting and make two significant advances. First, instead of relying on a manually annotated 3D skeleton, we automatically estimate a class-specific skeleton from the selected reference image. Second, we improve the shape reconstructions with novel instance-specific optimization strategies that allow reconstructions to faithfully fit each instance while preserving the class-specific priors learned across all images. Experiments on in-the-wild image ensembles show that Hi-LASSIE obtains higher-quality state-of-the-art 3D reconstructions despite requiring minimal user input.